# Video Q&A

Videochat R1 7B

VideoChat-R1_7B is a multimodal video understanding model based on Qwen2.5-VL-7B-Instruct. It takes video and text as input and generates text output.

Transformers, English

Llava Video 7B Qwen2

LLaVA-Video is a 7B-parameter multimodal model built on the Qwen2 language model. It specializes in video understanding tasks and supports 64-frame video input.

Transformers, English
